Polynomial conjoint analysis of similarities:
A model for constructing polynomial conjoint
measurement algorithms



A model permitting construction of algor- 
ithms for the polynomial conjoint analysis of 
similarities is presented. This model, which 
is based on concepts used in nonmetric scaling, 
permits one to obtain the best approximate sol- 
ution. The concepts used to construct nonmet- 
ric scaling algorithms are reviewed. Finally, 
examples of algorithmic models for nonmetric 
scaling, multidimensional unfolding, conjoint 
measurement, factor analysis, subjective expec- 
ted utility, and the Bradley-Terry-Luce choice 
problem are presented. 



In his paper on polynomial conjoint measurement, Tversky 
(1967) indicated that one of the important unsolved problems 
faced by his and similar measurement models is the construction
of algorithms for obtaining numerical solutions commensurate 
with the model. It is the purpose of this paper to indicate a 
general solution to this problem. 

The first section of this paper presents a brief review of 
the polynomial conjoint measurement model proposed by Tversky. 
In the next section, it is noted that the problem of algorithm
construction has been solved for one polynomial conjoint
measurement model, the nonmetric multidimensional scaling model.
A thorough review of the concepts of nonmetric scaling algorithms
is presented, and it is proposed that the same concepts can be
successfully adopted for a wide range of polynomial conjoint
measurement models. In the next section of the paper a model permitting
the construction of algorithms for polynomial conjoint analysis
is presented. In the final section several examples of specific
submodels are presented.



Polynomial conjoint measurement.

Tversky (1967) noted that one of the goals of scientific
investigation may be regarded as the decomposition of complex
phenomena into sets of basic factors according to some specified
rules of combination. When the factors can be measured
independently, one desires to account for their joint effects by the
appropriate combination rule. It is often the case, however,
that the factors cannot be measured independently, and that only
the order of their joint effects is known. In this case it is
desirable to be able simultaneously to reduce the complex
phenomena to their basic factors and to obtain a measurement of these
basic factors such that the combination of the factors accounts
for the order of the observations. This is the conjoint
measurement problem, and the combination rule is known as the
conjoint measurement model.

In particular, a data matrix meets the requirements for 
polynomial conjoint measurement if some monotonic transformation 
of the data matrix can be decomposed into several factors. The
decomposition rule must be some specified series of sums,
differences, and products of the factors themselves. Such a
decomposition rule is called a polynomial function.

In his paper, Tversky investigated the necessary and suffi- 
cient conditions under which a data matrix can be represented by 
a polynomial conjoint measurement model. It is not the purpose 
of this paper to delve into these conditions, but rather to pre- 
sent a method for measuring the factors and their effects, con- 
ditions permitting. If the conditions do not permit such mea- 
surement, then the method to be presented obtains a least 
squares estimate of the measurements and their effects, as well 
as providing information concerning the accuracy of the estimates. 

In his paper, Tversky (1967) presents several examples of
polynomial conjoint combination rules. These rules include the
Hullian and Spencian performance models cited in Hilgard (1965),
the Bradley-Terry-Luce choice model (Luce, 1959), the subjective
expected utility model (Savage, 1954), and the nonmetric
multidimensional scaling models (Coombs, 1964; Shepard, 1964). For
one of these models, the nonmetric multidimensional scaling
model, the computation problem has been thoroughly investigated
(Guttman, 1968) and several computer programs exist (Kruskal,
1964; McGee, 1966; Lingoes, 1965; Young, 1968). The relationship
between several of the algorithms has been investigated by Young
and Appelbaum (1968).

It is the hypothesis of this paper that the general approach
to construction of algorithms for nonmetric multidimensional
scaling may also serve as an approach for constructing algorithms
for polynomial conjoint analysis. In fact, the former is a
special case of the latter. When the method for constructing
algorithms for nonmetric scaling is understood, and when the rela- 
tion between nonmetric scaling and polynomial conjoint analysis 
is understood, then it is clear what steps must be taken to 
generalize nonmetric scaling algorithms to obtain polynomial 
conjoint analysis algorithms. 

Nonmetric scaling algorithms.

In 1962 Shepard introduced the first algorithm for nonmetric 
multidimensional scaling. He stated that the goal of this anal- 
ytic method was to derive the metric structure of an unknown con- 
figuration of points in a Euclidian space of unknown dimensional- 
ity on the basis of nonmetric information about the proximity of 
the points. That is, Shepard's method attempted to simultaneously
convert the proximity measures into Euclidian distances, and to
obtain the coordinates underlying the distances. In polynomial
conjoint measurement terms, the Shepard method, by applying a
Euclidian combination rule, obtained the factors (coordinates)
whose effects (Euclidian distances) were monotonic with the
proximity measures. In matrix notation, Shepard's developments can
be expressed as

$$S \overset{m}{=} D = f(X), \tag{1}$$

where $S$ is the symmetric matrix of proximities between $p$ points,
$D$ is the $p$-order symmetric matrix of Euclidian distances, and $X$
is the rectangular matrix of $r$-dimensional coordinates with $p$
rows and $r$ columns. The symbol $\overset{m}{=}$ is used to indicate
that the matrix $D$ is monotonic with the matrix $S$. That is, if
$s_{ij} > s_{kl}$, then $d_{ij} \le d_{kl}$. As indicated in the
equation, matrix $D$ is related to the matrix $X$ through the
function $f$. The function is the
Euclidian distance function, and is performed on corresponding
elements in all pairs of rows of $X$. The function is defined as



$$f(X) = \left[ \sum_{a=1}^{r} (x_{ia} - x_{ja})^{2} \right]^{1/2}, \quad \text{for } i, j = 1, 2, \ldots, p. \tag{2}$$

Notice that for Shepard's developments the monotonicity
requirement is actually a weak decreasing monotonicity
requirement. That is, his requirement is weak in the sense that two
distances may equal each other even though the two corresponding
proximities do not, and his requirement is decreasing in the
sense that smaller distances correspond with larger proximities.
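
The distance function of equation (2) and the weak decreasing monotonicity requirement can be illustrated with a minimal sketch. The function names and the small data set below are hypothetical and are not taken from Shepard's program; the sketch only checks the requirement and does not attempt to find coordinates.

```python
import math
from itertools import combinations


def euclidean_distances(X):
    """Equation (2): Euclidian distance between every pair of rows of X."""
    p = len(X)
    return [
        [math.sqrt(sum((a - b) ** 2 for a, b in zip(X[i], X[j]))) for j in range(p)]
        for i in range(p)
    ]


def is_weakly_decreasing(S, D):
    """True if s_ij > s_kl implies d_ij <= d_kl for all off-diagonal pairs."""
    pairs = list(combinations(range(len(S)), 2))
    for i, j in pairs:
        for k, l in pairs:
            if S[i][j] > S[k][l] and D[i][j] > D[k][l]:
                return False
    return True


# Hypothetical example: three points on a line and made-up proximities.
X = [[0.0], [1.0], [3.0]]
S = [[0, 9, 2],
     [9, 0, 5],
     [2, 5, 0]]
D = euclidean_distances(X)
print(is_weakly_decreasing(S, D))
```

Ties among the distances are permitted by the check, which is what makes the requirement weak rather than strict.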

The analysis of proximities, as represented by equations
(1) and (2), served as the basis for the development of a method
by Kruskal (1964a; 1964b) which became known as nonmetric
multidimensional scaling. Perhaps the most important difference
between the two methods is that Kruskal desired to obtain a
matrix of distances that was a least squares fit to a matrix
representing a monotonic transformation of the similarities.
Notice that this differs from the Shepard approach by
introducing an objective definition of the best solution. As a
by-product of objectifying the definition of the best solution, Kruskal
found it necessary to introduce a new matrix. This matrix,
called the matrix of disparities by the current author (Young,
1968b), allowed Kruskal to perform computations on numbers which
were monotonic with the similarities without actually violating
the ordinal assumptions about the similarities.
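
One standard way to obtain such disparities is monotone (isotonic) least squares regression of the distances on the order of the similarities, computed by pooling adjacent violators. The sketch below assumes that procedure and a decreasing relation between similarities and distances; the function names and the data are hypothetical, and the paper itself does not spell out the computation.

```python
def monotone_regression(values):
    """Least squares nondecreasing fit to `values` (pool adjacent violators)."""
    blocks = []  # each block holds [sum of values, number of values]
    for v in values:
        blocks.append([v, 1])
        # Merge backwards while the previous block mean exceeds the current one.
        while len(blocks) > 1 and (
            blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
        ):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    fitted = []
    for total, count in blocks:
        fitted.extend([total / count] * count)
    return fitted


def disparities(similarities, distances):
    """Disparities that are weakly decreasing in the similarities and are a
    least squares fit to the distances (one entry per stimulus pair)."""
    # Largest similarity first, so the fitted values must be nondecreasing.
    order = sorted(range(len(similarities)), key=lambda k: -similarities[k])
    fitted = monotone_regression([distances[k] for k in order])
    result = [0.0] * len(distances)
    for rank, k in enumerate(order):
        result[k] = fitted[rank]
    return result


# Hypothetical similarities and current distances for four stimulus pairs.
print(disparities([9, 5, 2, 7], [1.0, 3.0, 2.0, 2.0]))
```

In this example the two pairs whose distances violate the required order are replaced by their common mean, so the disparities respect the ordinal information while staying as close as possible to the distances.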

A second important difference between the Shepard and
Kruskal methods is that Kruskal generalized his definition of
the distance function to include all Minkowski spaces. The
familiar Euclidian space is a special case of the more general
Minkowski space, as is the "city-block" space used by Attneave
(1950).

In matrix notation, Kruskal's developments can be expressed
as

$$S \overset{m}{=} \Delta \cong D = g(X), \tag{3}$$

where $\Delta$ is the matrix of disparities (symmetric with $p$ rows and
columns), and where the symbol $\cong$ indicates a least squares
approximation. The matrix $D$ is related to the coordinates $X$
by the function $g$. This is the Minkowski distance function and
is defined as

$$g(X) = \left[ \sum_{a=1}^{r} |x_{ia} - x_{ja}|^{c} \right]^{1/c}, \quad \text{for } i, j = 1, 2, \ldots, p, \tag{4}$$

where the function is defined for corresponding elements in all
pairs of rows of $X$, and where $c$ is the Minkowski constant such
that $c \ge 1$.
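
A minimal sketch of the Minkowski distance function of equation (4) follows. The function name and the example coordinates are hypothetical; the sketch assumes absolute differences inside the sum and a Minkowski constant of at least 1.

```python
def minkowski_distances(X, c):
    """Equation (4): Minkowski distance between every pair of rows of X."""
    if c < 1:
        raise ValueError("the Minkowski constant must be at least 1")
    p = len(X)
    return [
        [sum(abs(a - b) ** c for a, b in zip(X[i], X[j])) ** (1.0 / c)
         for j in range(p)]
        for i in range(p)
    ]


# Hypothetical coordinates for two points in two dimensions.
X = [[0.0, 0.0], [3.0, 4.0]]
city_block = minkowski_distances(X, 1)   # c = 1
euclidean = minkowski_distances(X, 2)    # c = 2
print(city_block[0][1], euclidean[0][1])
```

With $c = 2$ the function reduces to the Euclidian distance of equation (2), and with $c = 1$ it gives the city-block distance.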

In summary, nonmetric scaling, as represented by Kruskal's
developments, allowed the analysis of similarities in any
Minkowski space, such that the best possible monotonic
transformation was obtained. In polynomial conjoint measurement terms,
Kruskal's nonmetric scaling, using a combination rule defined by the
Minkowski distance function in equation (4), was able to
simultaneously obtain the factors (coordinates) and their effects
(Minkowski distances) such that the effects were monotonic with
the data matrix (similarities).

Following Kruskal's developments several investigators have
introduced analogous methods of analysis (Lingoes, 1965; McGee,
1966; Young, 1968a). An extremely thorough discussion of the
general considerations for constructing nonmetric scaling
algorithms has been presented by Guttman (1968). The relations among
several of the methods have been discussed by Young & Appelbaum
(1968).

In the next section of this paper it is shown how equations
(3) and (4) can be generalized in order to apply the well
understood methods of nonmetric scaling algorithms to polynomial
conjoint analysis of similarities.

Polynomial conjoint analysis of similarities.

The model for constructing algorithms for polynomial conjoint
analysis of similarities involves two fundamental generalizations
of the nonmetric scaling model. One of these generalizations
involves modifying the function relating the matrix $X$ to the matrix
$D$, and the other generalization involves removing the restriction
that the matrix $S$ be a symmetric matrix.

Analysis of rectangular matrices. The key to understanding
the generalization of the method to include rectangular matrices
is the concept of a supermatrix. It will prove useful to
rewrite equation (3) in supermatrix notation as

$$\begin{bmatrix} S_{11} & S_{12} \\ S_{21} & S_{22} \end{bmatrix} \overset{m}{=} \begin{bmatrix} \Delta_{11} & \Delta_{12} \\ \Delta_{21} & \Delta_{22} \end{bmatrix} \cong \begin{bmatrix} D_{11} & D_{12} \\ D_{21} & D_{22} \end{bmatrix} = g \begin{bmatrix} X_1 \\ X_2 \end{bmatrix}. \tag{5}$$



That is, we are re-defining the matrix $S$ of similarities as
being a supermatrix, and in a parallel manner are re-defining
the matrices of disparities ($\Delta$), distances ($D$), and
coordinates ($X$) as being supermatrices.

Consider each submatrix in equation (5). The matrix $S_{11}$
contains the similarities of one set of stimuli, let us say
set 1. Notice that the similarities are of the stimuli within
set 1. The matrix $S_{22}$ contains parallel information for the
stimuli within set 2. Both these matrices are necessarily
symmetric. We will denote the number of rows and columns in $S_{11}$
as $p_1$ and the number of rows and columns in $S_{22}$ as $p_2$.
Turning our attention to the matrix $S_{12}$, notice that it contains
similarities between stimuli in sets 1 and 2. This matrix is
rectangular with $p_1$ rows and $p_2$ columns. We note that $S_{21}$ is
simply the transpose of $S_{12}$. The same relationships hold for
the matrices of disparities and distances. In a corresponding
manner, the matrix $X_1$ contains coordinates for the stimuli in
set 1 and $X_2$ for the stimuli in set 2. $X_1$, therefore, has $p_1$
rows and $r$ columns, and $X_2$ has $p_2$ rows and $r$ columns.

The final step in generalizing the Kruskal model to include
the analysis of rectangular matrices as well as symmetric
matrices is to assume that there is no information concerning
the similarities within sets 1 and 2. On the basis of this
assumption we write

$$S_{12} \overset{m}{=} \Delta_{12} \cong D_{12} = g' \begin{bmatrix} X_1 \\ X_2 \end{bmatrix}, \tag{6}$$

where $S_{12}$ is a rectangular matrix of similarities which is
monotonic with $\Delta_{12}$, the rectangular matrix of disparities. The
matrix $\Delta_{12}$ is, in turn, a least squares approximation to $D_{12}$,
the rectangular matrix of distances. The distances in $D_{12}$, in turn,
are between the points in set 1 (whose coordinates are
represented by $X_1$) and those in set 2 (whose coordinates are
represented by $X_2$). The definition of the function $g$ is slightly
modified so that each row in $X_1$ is compared with each row in $X_2$;
we denote the new function $g'$ and it is defined as



$$g'(X) = \left[ \sum_{a=1}^{r} |x_{ia} - x_{ja}|^{c} \right]^{1/c}, \quad \text{for } i = 1, 2, \ldots, p_1;\ j = 1, 2, \ldots, p_2. \tag{7}$$



It should be noted that by applying the function $g$ as defined by
equation (4) to the submatrix $X_1$ we obtain

$$D_{11} = g(X_1), \tag{8}$$

and applying it to $X_2$ we obtain

$$D_{22} = g(X_2). \tag{9}$$

Let us look at matrices $X_1$ and $X_2$ for a moment. Both
matrices have $r$ columns corresponding with the dimensionality
of the space the analysis is being performed in. The number of
rows in $X_1$ corresponds with the number of rows in $S_{12}$, whereas
the rows of $X_2$ correspond with the columns of $S_{12}$. That is, the
matrix $X_1$ may be thought of as representing the row effects of
data matrix $S_{12}$, and $X_2$ represents the column effects of the data.
Note that $X_1$ and $X_2$ must be of the same dimensionality and are
determined up to a joint unit and rotation.

In summary, the matrices $X_1$ and $X_2$ are combined, through the
operation $g'$, to produce the matrix $D_{12}$. $D_{12}$, in turn, is a
least squares fit to $\Delta_{12}$, given that $\Delta_{12}$ is perfectly monotonic
with the data $S_{12}$. The monotonicity restraint may be either
increasing or decreasing and is weak. The matrix $D_{11}$ is related
to $X_1$ by the operation $g$, and $D_{22}$ is related to $X_2$ by the same
operation $g$.
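
The between-set operation described above can be sketched as follows. The function name, the default Minkowski constant, and the coordinates are hypothetical; the point of the sketch is only that every row of the first coordinate matrix is compared with every row of the second, giving a rectangular matrix of distances.

```python
def between_set_distances(X1, X2, c=2.0):
    """Minkowski distances from each row of X1 (set 1) to each row of X2
    (set 2); the result has len(X1) rows and len(X2) columns."""
    return [
        [sum(abs(a - b) ** c for a, b in zip(row1, row2)) ** (1.0 / c)
         for row2 in X2]
        for row1 in X1
    ]


# Hypothetical coordinates: three row stimuli and two column stimuli, r = 2.
X1 = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]
X2 = [[3.0, 4.0], [0.0, 0.0]]
D12 = between_set_distances(X1, X2)
print(len(D12), len(D12[0]))
```

Both coordinate matrices must have the same number of columns, in line with the requirement that the two sets of points lie in a space of the same dimensionality.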

Generalized function. The second generalization of the
nonmetric model is to relax the function relating the matrix $D$ of
distances and $X$ of coordinates. The revised function is denoted
$h$ for symmetric cases, and is defined as

$$h(X) = h_1(h_2(x_{i.}, x_{j.})), \quad \text{for } i, j = 1, 2, \ldots, p, \tag{10}$$

and for rectangular cases is denoted $h'$ and is defined as

$$h'(X) = h'_1(h'_2(x_{i.}, x_{j.})), \quad \text{for } i = 1, 2, \ldots, p_1;\ j = 1, 2, \ldots, p_2, \tag{11}$$

where the notation $x_{i.}$ is used to indicate the entire $i$'th row
of $X$.

The entire model for the polynomial conjoint analysis of
similarities can be represented, for the symmetric case, by the
equation

$$S \overset{m}{=} \Delta \cong D = h(X), \tag{12}$$

where $h$ is defined by equation (10). For the nonsymmetric case,
the model is represented by the equation

$$S_{12} \overset{m}{=} \Delta_{12} \cong D_{12} = h' \begin{bmatrix} X_1 \\ X_2 \end{bmatrix}, \tag{13}$$

where $h'$ is defined by equation (11). It should be noted that
the function $h'$ can also be applied to $X_1$ and $X_2$ in the
rectangular case, giving us

$$D_{11} = h'(X_1) \quad \text{and} \quad D_{22} = h'(X_2). \tag{14}$$


In summary, for symmetric analyses the matrix X contains 
the coordinates (or factors or dimensions, etc.) whose distances 
in the space defined by the function h best reproduce the order 
(or the inverse of the order) of the entries in the data matrix 
S. For rectangular analyses the matrix $X_1$ contains the row
coordinates (or row effects or row factors, etc.) and the matrix
$X_2$ contains the column coordinates (or column effects, or column
factors, etc.) whose between-set distances in the space defined
by the function $h'$ best reproduce the order (or the inverse of
the order) of the entries in the rectangular data matrix $S_{12}$.

Specific Submodels of the General Model 

The function $h$ relating the matrices $D$ and $X$ is too general
to be of immediate interest. It is possible, however, to make
specific assumptions concerning the functions $h_1$ and $h_2$,
generating what will be called specific submodels of the general model.
Some examples of a few familiar submodels are presented below.

Euclidian scaling. The submodel for standard Euclidian
nonmetric multidimensional scaling is obtained by assuming

$$h_1(h_2) = [h_2]^{1/2}$$

and

$$h_2(x_{i.}, x_{j.}) = \sum_{a=1}^{r} (x_{ia} - x_{ja})^{2}.$$

With these assumptions equation (10) becomes the familiar
Euclidian distance function presented earlier as equation (2). This
submodel corresponds directly with one of the programs of Guttman
and Lingoes (Lingoes, 1965), and with the program presented by
McGee (1966).
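
The way a submodel is generated from the general model can be sketched by treating the outer function $h_1$ and the inner function $h_2$ as interchangeable arguments. The helper names below are hypothetical; the sketch assumes the Euclidian submodel, with a sum of squared differences as the inner function and a square root as the outer function.

```python
import math
from typing import Callable, List, Sequence

Row = Sequence[float]


def combine(
    X1: Sequence[Row],
    X2: Sequence[Row],
    h1: Callable[[float], float],
    h2: Callable[[Row, Row], float],
) -> List[List[float]]:
    """Apply h1(h2(x_i., x_j.)) to every row of X1 paired with every row of X2.
    Passing the same matrix twice gives the symmetric case."""
    return [[h1(h2(xi, xj)) for xj in X2] for xi in X1]


def sum_of_squared_differences(xi: Row, xj: Row) -> float:
    """Inner function of the Euclidian submodel."""
    return sum((u - v) ** 2 for u, v in zip(xi, xj))


# Hypothetical coordinates for three points in two dimensions.
X = [[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]]
D = combine(X, X, math.sqrt, sum_of_squared_differences)
print(D[0][1], D[0][2])
```

Other submodels are obtained by passing a different pair of functions to the same routine, which is the sense in which a single algorithmic framework can serve the whole class of models.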

Minkowski scaling. The submodel for nonmetric
multidimensional scaling in any Minkowski space is provided by assuming

$$h_1(h_2) = [h_2]^{1/c}$$

and

$$h_2(x_{i.}, x_{j.}) = \sum_{a=1}^{r} |x_{ia} - x_{ja}|^{c},$$

where $c$ is, as before, the Minkowski constant. With these
assumptions equation (10) becomes the Minkowski distance function
presented as equation (4). This submodel corresponds directly
with the model proposed by Kruskal, and with the program
prepared by Young and Torgerson (1967). One of the important
Minkowski spaces which has been used in psychological research
is the city-block space corresponding with a Minkowski constant
of 1. Attneave (1950) has reported some analyses using this
space.

Multidimensional unfolding. The rectangular version of the
Euclidian nonmetric multidimensional scaling submodel corresponds
with the multidimensional unfolding model proposed by Coombs
(1964). The submodel is obtained by assuming

$$h'_1(h'_2) = [h'_2]^{1/2}$$

and

$$h'_2(x_{i.}, x_{j.}) = \sum_{a=1}^{r} (x_{ia} - x_{ja})^{2}.$$

With these assumptions equation (11) becomes a Euclidian distance
function between two sets of coordinates. This function is the
one proposed by Coombs, and corresponds with the program prepared
by Lingoes (1966), and the program written by Young (1968a, 1968b).

Minkowski unfolding. The rectangular version of the
Minkowski nonmetric multidimensional scaling submodel generates a model
which would logically be called a Minkowski unfolding model.
This submodel is obtained by assuming

$$h'_1(h'_2) = [h'_2]^{1/c}$$

and

$$h'_2(x_{i.}, x_{j.}) = \sum_{a=1}^{r} |x_{ia} - x_{ja}|^{c}.$$

The author is unaware of anyone having proposed this model, but
the program by Young (1968a, 1968b) is capable of performing
analyses based on this model.

Dominance metric. In the area of discrimination and
generalization several different models have been presented to account
for response generalization when the stimuli are multidimensional.
Some of these models are discussed by Cross (1967) and, as he has
pointed out, they correspond with differing Minkowski spaces.
One of the models corresponds with the Euclidian scaling model,
and another corresponds with the city-block model discussed
earlier. A third model, which Cross calls the dominance model,
corresponds with a Minkowski space with infinite Minkowski
constant. In a dominance space the distance between two points is
defined as being equal to the largest of the absolute differences
between the coordinates. In the terminology being used here, we
would define the dominance submodel as

$$h_1(h_2) = h_2$$

and

$$h_2(x_{i.}, x_{j.}) = \max_{a=1}^{r} \left( |x_{ia} - x_{ja}| \right),$$

where the vertical lines indicate absolute value. No
computational method has been proposed for this model, to the knowledge
of the author. However, with the Kruskal model, several
available programs will provide essentially equivalent results by
using a very large number for the Minkowski constant.
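
The claim that a very large Minkowski constant gives essentially the dominance distance can be checked numerically with a small sketch. The function names and the two points are hypothetical; the Minkowski distance is computed after dividing by the largest absolute difference, a choice made here only to avoid overflow for large constants.

```python
def dominance_distance(x, y):
    """Largest absolute coordinate difference between two points."""
    return max(abs(a - b) for a, b in zip(x, y))


def minkowski_distance(x, y, c):
    """Minkowski distance, rescaled by the largest difference for stability."""
    diffs = [abs(a - b) for a, b in zip(x, y)]
    largest = max(diffs)
    if largest == 0.0:
        return 0.0
    return largest * sum((d / largest) ** c for d in diffs) ** (1.0 / c)


# Hypothetical points whose coordinate differences are 1, 2, and 3.
x = (0.0, 0.0, 0.0)
y = (1.0, 2.0, 3.0)
for c in (1, 2, 10, 50):
    print(c, minkowski_distance(x, y, c))
print("dominance", dominance_distance(x, y))
```

As the constant grows, the printed Minkowski distances decrease toward the dominance distance from above.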

Conjoint measurement. Luce and Tukey (1964) have presented
a powerful measurement model which they refer to as the conjoint
measurement model. This is basically an additive model and it
can be represented as the specific submodel

$$h'_1(h'_2) = h'_2$$

and

$$h'_2(x_{i.}, x_{j.}) = x_{i1} + x_{j1}.$$

Two programs exist which can perform analyses according to this
model, one written by Tversky & Zivian (1966), and one by Lingoes
(1968).

Polynomial conjoint measurement. A subset of the models
proposed by Tversky may be generated from our general model by
defining the submodel

$$h'_1(h'_2) = [h'_2]^{c}$$

and

$$h'_2(x_{i.}, x_{j.}) = \sum_{a=1}^{r} (x_{ia} + x_{ja})^{b},$$

where $b$ and $c$ are integer constants. In this case, equation (11)
becomes

$$d_{ij} = \left[ \sum_{a=1}^{r} (x_{ia} + x_{ja})^{b} \right]^{c}. \tag{15}$$

The submodel represented by equation (15) is actually a class of
submodels, with different submodels generated by different sets
of assumptions concerning the constants $r$, $b$, and $c$. A few
examples follow.

If we assume that $r = 1$, $b = 1$, and $c = 2$, then we see that

$$d_{ij} = x_{i1}^{2} + 2 x_{i1} x_{j1} + x_{j1}^{2},$$

which is simply the quadratic function of two variables. If we
assumed that $r = 1$, $b = 1$, and $c = 3$, then we would obtain the
formula for the cubic function of two variables. If, on the
other hand, we were to assume that $r = 2$, $c = 1$, and $b = 2$, we
would obtain a complex power function of four variables. It
should be clear that by the correct selection of the parameters
$b$ and $c$ we can determine the degree of the polynomial under
consideration, and that by changing the value of $r$ we can modify
the number of variables in the equation.
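
The examples above can be verified with a short sketch. It assumes that equation (15) has the form $d_{ij} = [\sum_{a} (x_{ia} + x_{ja})^{b}]^{c}$, with $b$ applied inside the sum and $c$ outside; the function name and the numerical values are hypothetical.

```python
def polynomial_effect(xi, xj, b, c):
    """Assumed form of equation (15): [sum_a (x_ia + x_ja)**b] ** c."""
    return sum((u + v) ** b for u, v in zip(xi, xj)) ** c


# r = 1, b = 1, c = 2: the quadratic function of two variables.
x_i1, x_j1 = 2, 3
quadratic = polynomial_effect([x_i1], [x_j1], b=1, c=2)
expanded = x_i1 ** 2 + 2 * x_i1 * x_j1 + x_j1 ** 2
print(quadratic, expanded)

# r = 2, b = 2, c = 1: a power function of four variables.
print(polynomial_effect([1, 2], [3, 4], b=2, c=1))
```

The first pair of printed values agree, which is the expansion of the squared sum given in the text.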

Nonmetric factor analysis. Several nonmetric analogs of
factor analysis have been proposed (Shepard, 1962; Lingoes, 1967b).
One possible analog, differing from those presented earlier,
will be presented here. This is specifically an analog of the
Tucker and Messick points-of-view model (1963) as discussed by
Cliff (1968) and Young and Pennell (1967). If one defines the
submodel as

$$h'_1(h'_2) = h'_2$$

and

$$h'_2(x_{i.}, x_{j.}) = \sum_{a=1}^{r} x_{ia} x_{ja},$$

then equation (11) becomes

$$d_{ij} = \sum_{a=1}^{r} x_{ia} x_{ja},$$

or, in matrix terms,

$$D = X_1 X_2'.$$

This corresponds with the Tucker and Messick model which involves
the matrix equation (using our symbols)

$$D = X_1 \Gamma X_2',$$

where $\Gamma$ is a diagonal matrix of weights.

Subjective expected utility. According to the subjective
expected utility model (Savage, 1954), when a subject chooses
between two gambles he makes his choice by maximizing the
subjective expected utility of the choices. The subjective
expected utility of a gamble is equal to the sum, over the
various choice objects, of the product of the utility of an outcome
and its subjective probability of occurrence.

For this submodel one defines

$$h'_1(h'_2) = h'_2$$

and

$$h'_2(x_{i.}, x_{j.}) = \sum_{a=1}^{r} x_{ia} x_{ja},$$

where there are $r$ outcomes for each gamble, and where the $x_{i.}$
represent the utilities and the $x_{j.}$ the subjective probabilities.
It should be obvious that the nonmetric analog of the factor
analysis model and the subjective expected utility model are
formally identical.
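
Because the two submodels share the same sum-of-products combination rule, a single routine can illustrate both. The sketch below is hypothetical in its names and data; the optional weights stand in for the diagonal matrix of weights in the Tucker and Messick equation and default to unit weights.

```python
def inner_products(X1, X2, weights=None):
    """Sum over a of w_a * x_ia * x_ja for each row of X1 and each row of X2."""
    r = len(X1[0])
    if weights is None:
        weights = [1] * r  # unit weights give the plain product of X1 and X2'
    return [
        [sum(weights[a] * xi[a] * xj[a] for a in range(r)) for xj in X2]
        for xi in X1
    ]


# Read as factor analysis: rows of X1 and X2 are row and column factors.
X1 = [[1, 2], [0, 1]]
X2 = [[3, 1], [2, 2], [1, 0]]
print(inner_products(X1, X2))
print(inner_products(X1, X2, weights=[2, 1]))

# Read as subjective expected utility: one row of utilities paired with
# one row of subjective probabilities over r = 3 outcomes.
utilities = [[10.0, 0.0, -5.0]]
probabilities = [[0.5, 0.25, 0.25]]
print(inner_products(utilities, probabilities)[0][0])
```

Only the interpretation of the rows changes between the two readings; the arithmetic is the same.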

Bradley-Terry-Luce choice model. This model (Luce, 1959)
specifies the relation of choice probabilities when two choice
objects are presented to the scale values of the objects. The
model states that

$$p(c,d) = \frac{v(c)}{v(c) + v(d)},$$

where $v(c)$ represents the scale value of object $c$. The ordinal
version of this model can be written

$$p(c,d) < p(e,f) \iff v(c) - v(d) < v(e) - v(f).$$

In the terminology used here, if we assume

$$h'_1(h'_2) = h'_2$$

and

$$h'_2(x_{i.}, x_{j.}) = x_{i1} - x_{j1},$$

then

$$d_{ij} = x_{i1} - x_{j1},$$

where $d_{ij}$ takes on the role of the choice probability between
objects $i$ and $j$, and $x_{i1}$ and $x_{j1}$ are the scale values of those
objects. It should be noticed that this is equivalent to a one
dimensional Minkowski metric.
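
The relation between the choice probabilities and the difference form can be checked numerically. The sketch below makes an assumption that is not stated in the text: the scale values entering the difference are taken to be the logarithms of the ratio-scale values that enter the probability formula, since the ordering of the probabilities then agrees exactly with the ordering of the differences. Function names and scale values are hypothetical.

```python
import math
from itertools import permutations


def btl_probability(v_c, v_d):
    """Probability of choosing c over d from ratio-scale values v(c), v(d)."""
    return v_c / (v_c + v_d)


def ordinal_form_holds(values):
    """Check that p(c,d) < p(e,f) exactly when u(c) - u(d) < u(e) - u(f),
    where u = log v (an assumption made for this sketch)."""
    u = {name: math.log(v) for name, v in values.items()}
    pairs = list(permutations(values, 2))
    return all(
        (btl_probability(values[c], values[d]) < btl_probability(values[e], values[f]))
        == (u[c] - u[d] < u[e] - u[f])
        for c, d in pairs
        for e, f in pairs
    )


# Hypothetical ratio-scale values for four choice objects.
values = {"a": 1.0, "b": 2.0, "c": 5.0, "d": 11.0}
print(btl_probability(values["a"], values["b"]))
print(ordinal_form_holds(values))
```

Since only the order of the probabilities is used, any monotonic transformation of them would lead to the same check, which is what makes the model a candidate for the nonmetric approach described here.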



Conclusions 

On the basis of notions fundamental to nonmetric multidi- 
mensional scaling, a model has been developed which indicates 
a method for constructing algorithms for the polynomial con- 
joint analysis of similarities. It has been shown that this 
model includes, as special submodels, several of the common 
forms of nonmetric scaling, many of the forms of polynomial 
conjoint analysis, and several popular choice models. It 
should be obvious that, with the proper specification of the 
functional relationships indicated by equation (10) or (11), 
a great range of polynomial conjoint models is possible. 

Perhaps one of the major advantages of the model presented
here is that it provides a means for minimizing the complex
functions represented by equations (10) through (13). The
iterative minimization algorithms used in nonmetric scaling may
be applicable to this new model. In a subsequent paper, an
objective definition of what is meant by a "best solution" will
be presented, along with a definition of a combination rule
including a wide range of useful polynomial conjoint measurement
submodels.